Skip to content

Handle req-resp responses with multiple chunks#407

Merged
MegaRedHand merged 2 commits into
lambdaclass:mainfrom
dicethedev:fix/req-resp-multi-chunk-responses
Jun 3, 2026
Merged

Handle req-resp responses with multiple chunks#407
MegaRedHand merged 2 commits into
lambdaclass:mainfrom
dicethedev:fix/req-resp-multi-chunk-responses

Conversation

@dicethedev
Copy link
Copy Markdown
Contributor

🗒️ Description / Motivation

  • Fixes req-resp payload decoding so multi-chunk responses are handled correctly.
  • decode_payload previously read the entire remaining stream, causing BlocksByRange / multi-block responses to fail when the first chunk decoder consumed bytes belonging to following chunks.
  • This solves snappy decode failures for responses containing multiple payload chunks.

What Changed

  • Updated crates/net/p2p/src/req_resp/encoding.rs.
  • Reworked decode_payload to read one varint-prefixed snappy payload at a time.
  • Added async helpers for reading the varint prefix and one snappy frame.
  • Added tests covering adjacent payloads, large snappy-framed payloads, and empty payload boundaries.

Correctness / Behavior Guarantees

  • Each response chunk is decoded independently.
  • The next response code / payload remains unread for the outer response loop.
  • Existing maximum payload and compressed payload size checks are preserved.
  • Response size metrics now report the compressed size for the current payload only.

Tests Added / Run

  • Added req-resp encoding tests.
  • Ran:
    • cargo fmt --check
    • cargo test -p ethlambda-p2p

Related Issues / PRs

✅ Verification Checklist

  • [ x] Ran make fmt — clean
  • [ x] Ran make lint (clippy with -D warnings) — clean
  • [ x] Ran cargo test --workspace --release — all passing

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Jun 1, 2026

Greptile Summary

This PR fixes multi-chunk req-resp decoding by reworking decode_payload to read one varint-prefixed snappy frame at a time instead of slurping the entire stream, and adds BlocksByRange long-range sync triggered from status responses. The encoding fix is well-implemented and tested; however the new range-sync retry path in handle_retry_range_sync contains a logic bug that silently disables retries.

  • encoding.rs: read_varint + read_snappy_frame helpers correctly consume exactly one payload per call, leaving subsequent chunks untouched; three new tests cover adjacent payloads, large frames, and empty-payload boundaries.
  • handlers.rs / lib.rs: Adds request_blocks_by_range_from_peer, response chaining, and retry scheduling — but handle_retry_range_sync checks pending_range_requests.contains(&(start_slot, total_end_slot)) with a key that is never stored for multi-batch ranges (stored key is (effective_start, batch_end)), and the variable name is inverted, causing all range-sync retries to be skipped unconditionally.

Confidence Score: 3/5

The encoding fix is correct and tested, but the range-sync retry path is broken for multi-batch ranges, causing failures to be silently dropped.

The encoding fix is sound and well-tested. The concern is handle_retry_range_sync: the deduplication key (start_slot, total_end_slot) is never stored for multi-batch ranges, so the guard always fires and retries are always skipped. Long-range sync will silently stall after any first-batch failure.

crates/net/p2p/src/lib.rs (handle_retry_range_sync) and crates/net/p2p/src/req_resp/handlers.rs

Important Files Changed

Filename Overview
crates/net/p2p/src/req_resp/encoding.rs Core fix: decode_payload now reads one varint-prefixed snappy payload at a time via read_varint + read_snappy_frame helpers; tests cover multi-chunk, large, and empty-payload cases.
crates/net/p2p/src/req_resp/handlers.rs Adds BlocksByRange handling and range-sync retry; retry guard in handle_retry_range_sync is broken for multi-batch ranges, plus a redundant remove call.
crates/net/p2p/src/lib.rs Adds pending_range_requests, PendingRequestKind, and handle_retry_range_sync; retry guard contains the inverted/wrong-key bug.
crates/net/p2p/src/req_resp/mod.rs Re-exports new functions; no issues.
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
crates/net/p2p/src/lib.rs:436-448
**Retry guard is inverted and uses a stale/wrong key — range sync retries never fire**

`still_needed` is named backwards: it is `true` when `pending_range_requests` does NOT contain the key, yet the function returns early (`return`) precisely when `still_needed` is `true`, so it _skips_ the retry when it should proceed and _proceeds_ when it should skip. More critically, the lookup key is `(start_slot, total_end_slot)`, but `pending_range_requests` stores `(effective_start, batch_end)` where `batch_end = min(total_end_slot, effective_start + MAX_REQUEST_BLOCKS − 1)`. For any multi-batch range, `batch_end < total_end_slot`, so the lookup always returns `false`, `still_needed` is always `true`, and the retry is unconditionally skipped. Range sync failures silently go unretried.

### Issue 2 of 2
crates/net/p2p/src/req_resp/handlers.rs:330-340
Redundant removal — `pending_range_requests` is already cleared by the caller before `handle_blocks_by_range_response` is invoked, making this a guaranteed no-op that adds confusion about which site is authoritative.

```suggestion
async fn handle_blocks_by_range_response(
    server: &mut P2PServer,
    blocks: Vec<SignedBlock>,
    peer: PeerId,
    start_slot: u64,
    end_slot: u64,
    total_end_slot: u64,
) {
    // Note: (start_slot, end_slot) has already been removed from
    // pending_range_requests by the caller before this function is invoked.
```

Reviews (1): Last reviewed commit: "fix req-resp multi-chunk response decodi..." | Re-trigger Greptile

Comment thread crates/net/p2p/src/lib.rs Outdated
Comment thread crates/net/p2p/src/req_resp/handlers.rs Outdated
Comment thread crates/net/p2p/src/req_resp/encoding.rs Outdated
@MegaRedHand
Copy link
Copy Markdown
Collaborator

Can you rebase f4b3d5e on main? That way we can review and merge that independently of #351

@dicethedev dicethedev force-pushed the fix/req-resp-multi-chunk-responses branch from 6ae58ff to 90891dd Compare June 1, 2026 22:41
@dicethedev dicethedev requested a review from MegaRedHand June 1, 2026 22:42
Comment thread crates/net/p2p/src/req_resp/encoding.rs Outdated
@dicethedev dicethedev force-pushed the fix/req-resp-multi-chunk-responses branch from 18035de to c07324e Compare June 3, 2026 06:13
@dicethedev dicethedev requested a review from MegaRedHand June 3, 2026 06:15
Copy link
Copy Markdown
Collaborator

@MegaRedHand MegaRedHand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@MegaRedHand MegaRedHand merged commit 4594d4f into lambdaclass:main Jun 3, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Handle req-resp responses with multiple chunks

2 participants